Identification of host receptors for viral entry and beyond: a perspective from the spike of SARS-CoV-2

Identification of the interaction between the host membrane receptor and viral receptor-binding domain (RBD) represents a crucial step for understanding viral pathophysiology and for developing drugs against pathogenic viruses. While all membrane receptors and carbohydrate chains could potentially be used as receptors for viruses, prioritized searches typically focus on membrane receptors that are known to have been used by the relatives of the pathogenic virus, e.g., ACE2 used as a receptor for SARS-CoV is a prioritized candidate receptor for SARS-CoV-2. An ideal receptor protein from a viral perspective is one that is highly expressed on the epithelial cell surface of mammalian respiratory or digestive tracts, strongly conserved in evolution so that many mammalian species can serve as potential hosts, and functionally important so that its expression cannot be readily downregulated by the host in response to the infection. Experimental confirmation of host receptors includes (1) infection studies with cell cultures/tissues/organs with or without candidate receptor expression, (2) experimental determination of protein structure of the complex between the putative viral RBD and the candidate host receptor, and (3) experiments with mutant candidate receptor or homologues of the candidate receptor in other species. Successful identification of the host receptor opens the door for mechanism-based development of candidate drugs and vaccines and facilitates the inference of what other animal species are vulnerable to the viral pathogen. I illustrate these approaches with research on identification of the receptor and co-factors for SARS-CoV-2.


Introduction
Many processes are involved in viral infection, including attachment to host cells, entry into host cells, evasion of host defense mechanisms, viral genome replication, transcription and translation within host cells, viral packaging, lysing host cells and initiating a new infection cycle. Among these processes, attaching to and entering the host cell are often the limiting steps, requiring the viral pathogen to evolve specific adaptation to the host. Once the virus is inside the host cell, the cytoplasmic environment for viral genome replication, transcription and translation is similar across diverse mammalian species.
The need and urgency of identifying the host receptor used by viruses are highlighted by the COVID-19 pandemic. What is the host receptor for the spike protein of SARS-CoV-2 (SARS-2S)? Which part of the SARS-2S binds to which part of the host receptor? What are the amino acid residues that interact between the host receptor and the viral receptor-binding protein? Can drugs be developed to block the binding of SARS-2S to the receptor? Will the drug interfere with the normal function of the receptor and cause a strong side effect? What are the normal functions of the host receptor? Are people who express less of this receptor protein less vulnerable to COVID-19 infection? What other mammalian species have similar host receptors that render them vulnerable to SARS-CoV-2 infection? How well can we predict species vulnerability based on receptor protein similarity in sequence and in structure? Answers to these questions contribute to a good understanding of pathophysiology and epidemiology and provide a foundation for drug development. I illustrate the multiomics approaches to address these questions related to host receptors.

Identification of host receptor and cofactors
Cell membranes contain many single-pass transmembrane receptors. Some of them can be internalized into cells upon ligand binding, and could serve as candidate receptors mediating viral attachment and cell entry. However, instead of screening all of them as candidate receptors, existing biological knowledge can speed up the search for host receptors or cofactors that support or enhance viral attachment and cell entry.

Identification of ACE2 as the host receptor
The first genomic sequence of SARS-CoV-2 was obtained on January 5, 2020, and made public on January 11, 2020. Previously, ACE2 was found to be the host receptor of SARS-S (Li et al., 2003; Kuba et al., 2005). The receptor-binding domain of SARS-S alone can bind to ACE2, leading to its internalization together with the host ACE2. Because of the similarity in sequence and domain organization between SARS-S and SARS-2S (Zhou P. et al., 2020; Xia, 2021), it is natural to infer that SARS-2S may use the same host receptor ACE2 as SARS-S (Zhou P. et al., 2020).
ACE2 is a typical single-pass transmembrane receptor with a hydrophobic signal peptide of 17 aa and a single hydrophobic transmembrane domain (Figure 1A). ACE2 is a metallopeptidase with a 5-aa HEMGH zinc-binding motif (Figure 1A). Several proteases including TMPRSS2 (transmembrane serine protease 2), ADAM-17 (a disintegrin and metalloprotease 17, also known as TACE) and HAT (human airway trypsin-like protease) can cleave ACE2 at the segment rich in lysine (K) and arginine (R) close to the transmembrane domain (Figure 1A) to shed enzymatically active soluble ACE2 (sACE2). This KR-rich segment is hydrophilic and consequently disordered, and is missing in the ACE2 structure (1R42, Figure 1B; Towler et al., 2004). What is particularly interesting is that such cleavage of ACE2 by proteases, especially by ADAM-17 (Haga et al., 2008, 2010; Scheller et al., 2011), is activated in SARS-CoV infection. One would naturally think that such cleavage might be a protective response by the host cells, i.e., if membrane-bound ACE2 mediates viral entry, then cleaving it off the membrane would decrease infection. Surprisingly, the generation of sACE2 enhances infection (Haga et al., 2008, 2010). This shows the complexity in pathogen-host interactions that I will discuss in more detail later. Ever since the first characterization of ACE2 (Donoghue et al., 2000; Tipnis et al., 2000), gene expression of ACE2 has been found to be high in kidney, heart, testis, colon and small intestine, but low in lungs (Hikmet et al., 2020; Li et al., 2020; Figure 1C). This low expression of ACE2 in lungs motivated the search for alternative receptors and cofactors, until it was found that ACE2 is highly expressed in type II pneumocytes in lungs (Hamming et al., 2004; To and Lo, 2004; Mossel et al., 2008; Xu et al., 2020; Zhao et al., 2020), so the low expression in lungs is due to the mixture of these type II pneumocytes with other types of lung cells that express little ACE2.
I should add that the "high" expression of ACE2 in type II pneumocytes is relative to other types of lung cells. The ACE2 expression in type II pneumocytes is still negligibly low relative to other cell types such as those in the digestive system (e.g., enterocytes) or connected to the digestive system (e.g., cholangiocytes), or cells in kidney (e.g., proximal tubular epithelial cells) or in testes (e.g., Sertoli cells), according to data in The Human Protein Atlas (Uhlén et al., 2015).
Interestingly, ACE2 was found to be expressed in oral tissues, especially in tongue (Xu et al., 2020), although the expression is generally low compared to that in the digestive tract. The expression of ACE2 in tongue indicates the potential of destruction of tongue cells upon COVID-19 infection. Whether this might be linked to the loss of taste, a common symptom of COVID-19 infection, has not been explored.
Other candidate receptors that have been used by various coronaviruses include aminopeptidase N (APN) and dipeptidyl peptidase 4 (DPP4). However, only cells expressing ACE2 are susceptible to SARS-CoV-2 infection. The presence/absence of APN or DPP4 is irrelevant to SARS-CoV-2 infection (Zhou P. et al., 2020). While ACE2 binds to SARS-S and SARS-2S, DPP4 does not. In particular, ACE2 from mice, which is substantially diverged from human ACE2, does not support viral entry (Zhou P. et al., 2020). However, transgenic mice expressing human ACE2 are vulnerable to SARS-CoV-2 and can develop COVID-19 symptoms, suggesting that ACE2 is a sufficient receptor for SARS-CoV-2 attachment and cell entry.
Further corroboration of the interaction between the viral RBD and the host ACE2 comes from microscopy methods and structural characterization. Microscopy methods such as confocal fluorescence microscopy can visualize the binding of coronavirus spike proteins to GFP-tagged ACE2. Structural studies have characterized not only the structure of the SARS-2S monomer and trimer (Hoffmann et al., 2020; Walls et al., 2020; Wrapp et al., 2020; Yan et al., 2020), but also the SARS-2S trimer and ACE2 in complex (Gui et al., 2017; Wang et al., 2020; Zhou T. et al., 2020; Xu et al., 2021). What remain to be elucidated are the mechanistic aspects of how ACE2 binding triggers the transformation of the SARS-2S trimer from the prefusion state to the postfusion state.
These structural studies also provide a list of amino acids in physical contact with each other from the two interacting partners (Lu et al., 2015; Adhikari and Ching, 2020). The sharing of the interacting amino acids in ACE2 was subsequently used to predict what other mammalian species have an ACE2 that can serve as a host receptor for SARS-CoV-2 infection (Shi et al., 2020; Kruglikov et al., 2021; Wei et al., 2021), which I discuss later. Such information also facilitates the identification of key residues that contribute to the host tropism of SARS-CoV-2. For example, SARS-CoV-2 cannot infect mice because of differences in five key residues in ACE2 between mouse and human. Replacing these residues created a mouse model susceptible to SARS-CoV-2 infection (Adams et al., 2021).
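The idea of scoring another species' ACE2 by how many interacting residues it shares with human ACE2 can be sketched as follows. This is only an illustration, not the method of the cited studies: the four positions are taken from the mouse-to-human substitutions named in this section (N30D, N31K, A329E, H353K) and are not the full contact interface, and the function names are hypothetical.

```python
# Residues at four ACE2 positions, inferred from the mouse-to-human
# substitutions named in the text (N30D, N31K, A329E, H353K).
# Assumption: this short list stands in for a full contact-residue list.
HUMAN_ACE2 = {30: "D", 31: "K", 329: "E", 353: "K"}
MOUSE_ACE2 = {30: "N", 31: "N", 329: "A", 353: "H"}


def shared_fraction(reference, candidate):
    """Fraction of reference positions where the candidate has the same residue."""
    if not reference:
        raise ValueError("reference must contain at least one position")
    matches = sum(1 for pos, aa in reference.items() if candidate.get(pos) == aa)
    return matches / len(reference)


def differing_positions(reference, candidate):
    """Sorted positions where the candidate residue differs from the reference."""
    return sorted(pos for pos, aa in reference.items() if candidate.get(pos) != aa)


if __name__ == "__main__":
    print(shared_fraction(HUMAN_ACE2, MOUSE_ACE2))
    print(differing_positions(HUMAN_ACE2, MOUSE_ACE2))
```

A real analysis would take the positions from a structure of the complex and the residues from aligned ACE2 sequences; a simple identity count also ignores conservative substitutions, which may matter for binding.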
Protein structures also shed light on interactions between the receptor and the viral spike proteins. For example, the structure (7KNB) of human ACE2 in complex with the SARS-2S trimer (Zhou T. et al., 2020) shows four segments in ACE2 (19-39, 323-330, 352-357, 385-390) and two segments in SARS-2S to be in close physical proximity. The amino acids in the four ACE2 segments jointly have an isoelectric point (pI) of 4.38, and those in the two SARS-2S segments jointly have a pI of 9.40. Thus, at neutral pH, the former is negatively charged, but the latter is positively charged. The two therefore would have favorable electrostatic interactions facilitating their binding to each other. This result makes sense of a previous mutation experiment (Adams et al., 2021) that converted the mouse ACE2, which cannot bind to SARS-2S, to one that can, based on the sequence difference between human and mouse ACE2. The two mutation constructs (hmACE2.3 and hmACE2.4) that introduced negatively charged amino acid residues present in human ACE2 into mouse ACE2 (N30D in hmACE2.3, and A329E in hmACE2.4) can functionally interact with SARS-2S just as well as human ACE2. The other two constructs (hmACE2.1 and hmACE2.2) introduced mutations to increase hydrophilicity (e.g., H353K in hmACE2.1 and N31K in hmACE2.2), which also improved the interaction of mouse ACE2 with SARS-2S, albeit to a smaller degree than hmACE2.3 and hmACE2.4.
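The reasoning from pI to net charge at neutral pH can be illustrated with a minimal sketch that estimates the net charge of a peptide from the Henderson-Hasselbalch equation and finds the pI by bisection. The pKa values below are an assumed illustrative set (published sets differ, so the pI of a given segment depends on the set used), the example peptides are hypothetical rather than the actual ACE2 or SARS-2S segments, and this is not necessarily how the values 4.38 and 9.40 were computed.

```python
# Assumed, illustrative side-chain and terminal pKa values.
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERMINUS = 8.6
PKA_C_TERMINUS = 3.6


def net_charge(seq, ph):
    """Approximate net charge of a peptide at a given pH."""
    # Terminal amino group (positive) and terminal carboxyl group (negative).
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))
    for aa in seq:
        if aa in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge


def isoelectric_point(seq, tol=1e-4):
    """pH at which the net charge is zero, found by bisection on [0, 14]."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        # Net charge decreases monotonically as pH rises.
        if net_charge(seq, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


if __name__ == "__main__":
    for peptide in ("DDEEDE", "KKRRKR"):  # hypothetical acidic and basic peptides
        print(peptide, round(isoelectric_point(peptide), 2),
              round(net_charge(peptide, 7.0), 2))
```

A peptide whose pI is below 7 carries a net negative charge at neutral pH, and one whose pI is above 7 carries a net positive charge, which is the basis of the electrostatic argument above.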
It is important to keep in mind the difference between SARS-CoV and SARS-CoV-2 in their use of ACE2 for cell entry (Xia, 2021). First, there are two documented alternative pathways of cell entry for coronaviruses after receptor binding: (1) cell entry by membrane fusion when the spike trimer is cleaved at the polybasic furin site, and (2) cell entry by clathrin-mediated endocytosis (Inoue et al., 2007) and the endosome-cathepsin pathway (Matsuyama et al., 2005, 2010). SARS-CoV-2, with the cleaved furin site, uses mainly pathway 1, whereas SARS-CoV uses pathway 2. The inhibition of clathrin-mediated endocytosis dramatically reduces cell entry by SARS-CoV (Inoue et al., 2007). SARS-CoV cannot use pathway 1 because it lacks the polybasic furin site. However, a polybasic furin site experimentally introduced into SARS-CoV at the same location as in SARS-CoV-2 created a much more infectious SARS-CoV (Belouzard et al., 2009) with syncytium formation characteristic of SARS-CoV-2 infection. Similarly, SARS-CoV-2 lacking the polybasic furin site is less infective with little syncytium formation (Peacock et al., 2021). Second, SARS-CoV-2 infection is frequently associated with syncytia formation (Daly et al., 2020; Hoffmann et al., 2020; Li et al., 2021), which is rarely reported with SARS-CoV infection. This syncytia formation implies that, once SARS-CoV-2 has entered a cell, it can infect neighboring cells without using ACE2. Thus, the higher ACE2 abundance in young people than in old people (Plaas et al., 2021; Bastolla et al., 2022) renders the young more susceptible to SARS-CoV than the old. However, the reduced dependence of SARS-CoV-2 on membrane ACE2 allows SARS-CoV-2 to infect old people who are immunologically weak (Montecino-Rodriguez et al., 2013), even though they do not express a high level of ACE2.

Figure 1. Domains, structure and tissue-specific expression of human ACE2. (A) Hydrophobicity plot generated from DAMBE (Xia, 2018b) based on hydrophobicity values in Kyte and Doolittle (1982) along a sliding window of 40 amino acids. The 17-aa signal peptide at the N-terminus, the 5-aa zinc-binding motif, the segment rich in lysine and arginine (KR-rich segment) serving as cleavage sites for TMPRSS2 and HAT proteases (Heurich et al., 2014), and the shaded hydrophobic transmembrane domain are indicated. (B) Structure of ACE2 (1R42) (Towler et al., 2004) with the dashed red line separating the extracellular domain on the left and the intracellular domain on the right. The KR-rich segment is hydrophilic and therefore disordered. It is missing from the structure, as is the transmembrane domain. The 5-aa zinc-binding motif is highlighted within the green-line enclosure. (C) The tissue-specific expression data are extracted from The Human Protein Atlas (Uhlén et al., 2015).

Frontiers in Microbiology 04 frontiersin.org

Other candidate receptors and cofactors
The identification of ACE2 as the host receptor does not imply that it is the only host receptor. Several viruses are known to use multiple receptors and co-factors. For example, Dengue virus uses both human mannose-binding receptor (MR) and DC-SIGN on macrophages as primary receptors (Lo et al., 2016), and HIV-1 uses both CD4 as a primary receptor and a cellular coreceptor (Wilen et al., 2012). Are there other receptors or cofactors that facilitate SARS-CoV-2 attachment and cell entry? Existing evidence points to a hypothesis that has not yet been fully explored. SARS-CoV-2 can bind to either membrane-bound ACE2 or soluble ACE2 (sACE2), which could then bind to membrane proteins such as neuropilin-1 (Cantuti-Castelvetri et al., 2020; Daly et al., 2020), integrins (Nader et al., 2021; Nader and Kerrigan, 2022), or other membrane proteins to anchor SARS-CoV-2 to the host cell membrane.
If SARS-S and SARS-2S use ACE2 as the only host receptor to mediate attachment and cell entry, then SARS-CoV and SARS-CoV-2 should infect the same tissue. However, the two viral lineages differ in their tissue tropism, with SARS-CoV more likely infecting lower respiratory tract than SARS-CoV-2. Two mutations in SARS-2S relative to SARS-S have been hypothesized to contribute to differences in cell tropism between SARS-CoV and SARS-CoV-2, and to involve alternative receptors or cofactors.

Polybasic furin site and NRP1 (neuropilin-1)
One conspicuous difference between SARS-S and SARS-2S is the presence of a polybasic furin site, RRAR^S, in the latter but not in the former (Andersen et al., 2020; Coutard et al., 2020; Hoffmann et al., 2020). Two lines of evidence suggest that this furin site is responsible for the difference in tissue tropism between SARS-CoV and SARS-CoV-2. First, a previous study demonstrated that inserting a polybasic furin site at the S1 and S2 boundary in SARS-S resulted in dramatic changes in cell tropism (Belouzard et al., 2009). Second, the spike protein trimer in SARS-CoV-2 virions is already cleaved at this furin site to prime the fusion between the viral and host membranes (Hoffmann et al., 2020; Xia, 2021), and the C-terminus of the cleaved S1 is accessible for interacting with other proteins (Walls et al., 2020; Wrapp et al., 2020). One may therefore infer that host membrane proteins with structural affinity to the cleaved end of SARS-CoV-2 could serve either as an alternative host receptor or as an enhancer of viral infection.
NRP1 is a single-pass membrane protein, as is evident from its hydrophobicity plot (Figure 2), which shows just a single hydrophobic transmembrane domain. The hydrophobic stretch of 21 amino acids at the N-terminus is the signal peptide (Figure 2). The a1 and a2 domains are also known as the CUB domains. The b1-b2 domains bind to the furin-cleaved C-terminus of the S1 subunit of SARS-CoV-2 (Daly et al., 2020). The 23-aa segment near the C-terminus of NRP1 is the single-pass transmembrane domain that divides the NRP1 protein into the long extracellular domain and the short 43-aa cytoplasmic domain.
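A sliding-window hydrophobicity profile of the kind used to spot signal peptides and transmembrane domains can be sketched as below. The residue values are the Kyte and Doolittle (1982) hydropathy scale; the function name and the toy sequence are hypothetical, and this is a minimal illustration rather than the DAMBE implementation.

```python
# Kyte and Doolittle (1982) hydropathy values.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=40):
    """Mean hydropathy of each window of `window` residues along `seq`."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # Toy sequence: a hydrophobic stretch flanked by charged residues.
    toy = "KKKK" + "LLLL" + "KKKK"
    profile = hydropathy_profile(toy, window=4)
    peak = max(range(len(profile)), key=profile.__getitem__)
    print(peak, profile[peak])
```

On a real protein, a window whose mean hydropathy stands well above its neighbours marks a candidate signal peptide (at the N-terminus) or transmembrane segment. The window length trades resolution for smoothness.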
NRP1 is a receptor for other glycoproteins such as VEGF-A and SEMA3A (Plein et al., 2014). Its b1-b2 domain binds specifically to furin-cleaved substrates that have an R/KXXR/K motif at the C-terminus, where X is any amino acid (Teesalu et al., 2009; Plein et al., 2014). The experimentally determined NRP1 structure shows the negatively charged D320 in NRP1 interacting electrostatically with the positively charged R/K residue at the C-terminus of the ligand (Guo and Vander Kooi, 2015), and a ligand with the terminal R/K removed may serve as an NRP1 inhibitor. The furin-cleaved S1 subunit of SARS-2S, with the C-terminal RRAR conforming to the R/KXXR/K motif, binds directly to NRP1 (Daly et al., 2020). The following three experimental studies demonstrated NRP1 to be a cofactor that enhances ACE2-mediated viral attachment and cell entry, although it does not serve as a sufficient host receptor for SARS-CoV-2 independent of ACE2 (Cantuti-Castelvetri et al., 2020; Daly et al., 2020). First, blocking the binding between the b1-b2 domain and the C-terminus of the viral S1 subunit significantly reduces viral internalization (Cantuti-Castelvetri et al., 2020). Second, removing the RRAR at the C-terminus of the S1 subunit of SARS-2S decreases the binding of S1 to NRP1, and knocking out NRP1 decreases SARS-CoV-2 infection of HeLa cells expressing ACE2 (Daly et al., 2020). Third, x-ray crystallography and biochemical approaches revealed that NRP1 enhances internalization of SARS-CoV-2 and syncytia formation (Daly et al., 2020), which has been observed previously to enhance SARS-CoV-2 propagation from cell to cell (Hoffmann et al., 2020).
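The requirement that the R/KXXR/K motif sit at the very C-terminus, so that it is exposed only after furin cleavage, can be expressed as a short pattern check. This is a sketch; the function name is hypothetical, and the flanking residues in the example strings are placeholders rather than the actual spike sequence.

```python
import re

# R or K, any two residues, then R or K, anchored at the end of the sequence.
C_TERMINAL_MOTIF = re.compile(r"[RK][A-Z]{2}[RK]\Z")


def has_c_terminal_motif(seq):
    """True if the sequence ends with an R/KXXR/K motif."""
    return bool(C_TERMINAL_MOTIF.search(seq.upper()))


if __name__ == "__main__":
    print(has_c_terminal_motif("GGGRRAR"))   # cleaved after RRAR: motif is terminal
    print(has_c_terminal_motif("GGGRRARS"))  # uncleaved RRAR^S: motif is internal
    print(has_c_terminal_motif("GGGRRA"))    # terminal R removed
```

The second and third cases correspond to an uncleaved furin site and to a ligand with the terminal R/K removed, neither of which presents the motif at the C-terminus.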
NRP1 is highly expressed in the olfactory epithelium, and the SARS-CoV-2 infection appears to be correlated with NRP1 expression (Cantuti-Castelvetri et al., 2020). This could explain why SARS-CoV-2 infects predominantly the upper respiratory tracts, in contrast to SARS-CoV that infects lower respiratory tracts and lungs. However, NRP1 is also abundantly expressed in almost all pulmonary tissues (Cantuti-Castelvetri et al., 2020) including lungs (Figure 3), so more studies are needed to check if patients with SARS-CoV-2 infection of lungs also have higher expression of NRP1 in lungs than those without lung infections.

The K403R mutation in SARS-2S, the resulting RGD motif and integrins
Another mutation in SARS-2S relative to SARS-S is K403R (Figure 4), which creates an RGD motif known to be a general integrin-binding motif (Takada et al., 2007). This motif is shared between SARS-CoV-2 and its close relatives isolated from pangolins. The homologous motif in SARS-CoV is KGD (Figure 4). Thus, both SARS-2S and the spike protein from pangolin-isolated SARSr (where r stands for coronaviruses closely related to SARS) are expected to bind to integrins, especially the major endothelial cell integrin, αVβ3 (Nader et al., 2021; Nader and Kerrigan, 2022).
The αV subunit has multiple β partners to form heterodimers, but the β3 subunit forms heterodimers mainly with αV. For this reason, αV can be highly expressed in tissues without β3 because αV has other β partners, but β3 is expressed mainly in tissues with αV (Figure 5). Three interesting findings were derived from in-silico molecular simulation (Nader et al., 2021). Firstly, both RGD in SARS-2S and KGD in SARS-S (Figure 5) are located in a long flexible loop (PDB ID 6M0J for SARS-2S and 5XLR for SARS-S) free to interact with other proteins. However, R is larger in volume than K (124 vs. 119), so RGD in SARS-2S is more solvent exposed than KGD in SARS-S. Secondly, the RGD motif fits nicely into the ligand-binding pocket of the host αVβ3. Thirdly, the RGD motif is located about 32 aa upstream of the receptor-binding domain (RBD) for ACE2, and the interaction between the RGD motif in SARS-2S and the host αVβ3 appears physically independent of the interaction between the viral RBD and the host ACE2. Therefore, the host αVβ3 could serve as an additional receptor for SARS-2S independent of ACE2. Three lines of experimental evidence support the hypothesis that αVβ3 may serve as an alternative receptor (Nader et al., 2021). First, SARS-2S binds strongly to αVβ3 in vitro. Second, SARS-CoV-2 binds strongly to endothelial cells (which could be due to binding of SARS-2S to ACE2, αVβ3, or any other potential receptors). Third, the binding between SARS-CoV-2 and endothelial cells can be inhibited by Cilengitide (a specific αVβ3 antagonist), which suggests that the binding between SARS-CoV-2 and endothelial cells is mediated by αVβ3. However, the consequence of the binding between SARS-CoV-2 and endothelial cells is not clear. The binding could lead to cell entry of SARS-CoV-2, which would qualify αVβ3 as an alternative receptor. The binding could also interfere with the normal function of αVβ3, which participates in many cellular processes including angiogenesis, cell adhesion and migration, and signaling (Takada et al., 2007), leading to loss of vascular barrier integrity and consequently enhancing SARS-CoV-2 infection and increasing the severity of COVID-19 (Nader and Kerrigan, 2022). One may argue that the expression of αVβ3 mainly in endothelial cells would limit its availability for SARS-CoV-2 infection, i.e., SARS-CoV-2 would need to first infect epithelial cells and then traverse to endothelial cells to access αVβ3 as a host receptor. In contrast, ACE2 is expressed in both epithelial and endothelial cells (Hamming et al., 2004). However, the epithelial cells and the endothelial cells are separated by only a very thin basement membrane in lungs. There are also integrins that are expressed in epithelial cells. For example, α5β1, which also binds to the RGD motif, is expressed in a variety of cells including epithelial cells in the digestive tract (Sheppard, 1996). Human α5β1 was also implicated in SARS-CoV-2 infection (Robles et al., 2022), and inhibition of human α5β1 by its inhibitor ATN-161 has been shown to reduce viral load in k18-hACE2 transgenic mice infected with SARS-CoV-2 (Amruta et al., 2021; Beddingfield et al., 2021).

Figure 2. Hydrophobicity plot and domain structure of human neuropilin-1 (NRP1) along a sliding window of 40 amino acids (aa). The b1-b2 domains bind to the furin-cleaved C-terminus of the S1 subunit of SARS-CoV-2. The signal peptide includes the first 21 aa. The 23 aa near the C-terminus of NRP1 constitute the single-pass transmembrane domain that divides the NRP1 protein into the long extracellular domain and the short 43-aa cytoplasmic domain. The hydrophobicity plot was generated from DAMBE (Xia, 2018b) based on hydrophobicity values in Kyte and Doolittle (1982). The domains are not drawn exactly to scale. The numbering of amino acids on the horizontal axis follows the neuropilin-1 isoform X1 annotated on human chromosome 10 (NC_000010).

Figure 3. Tissue-specific expression of NRP1 from The Human Protein Atlas (Uhlén et al., 2015), in units of TPM (transcripts per million). Gene expression in lung and olfactory regions is colored green and red, respectively.

Figure 4. Phylogenetic tree of spike protein sequences from SARS-CoV-2 and close relatives. OTU names are in the form of accession (GenBank or GISAID) followed by viral strain designation. The protein sequences were aligned by MAFFT (Katoh and Toh, 2008) with the accurate but slow L-INS-i option. The unrooted phylogenetic tree was reconstructed with PhyML (Guindon and Gascuel, 2003), with the empirical LG substitution matrix and optimization of topology, branch lengths and rates. Identical sites are colored green. The RGD motif, colored red, is shared between SARS-CoV-2 and a close relative isolated from pangolin, and differs from the KGD motif in SARS-CoV by a conservative K403R mutation. The RGD motif is the binding target of the major endothelial cell integrin, αVβ3.

Figure 5. Hydrophobicity plot and gene expression for the two subunits (αV and β3) of the major endothelial integrin αVβ3. (A,B) Hydrophobicity plot for αV and β3, respectively, with gene names ITGAV and ITGB3, respectively, along a 40-aa window. (C) Gene expression for αV and β3 in different tissues extracted from The Human Protein Atlas (Uhlén et al., 2015), in units of TPM (transcripts per million). At the N-terminus is the hydrophobic signal peptide. The hydrophobic transmembrane domain is shaded.
How important the RGD motif is in binding to αVβ3 or α5β1 in terms of sequence context could be investigated by either mutating the KGD motif in SARS-S to RGD or investigating the binding properties of the RGD-containing spike protein of the pangolin-derived SARSr. The latter can be done by in-silico protein docking and dynamic modelling. Whether the RGD motif binds to integrins as hypothesized above remains controversial (Zech et al., 2021;Othman et al., 2022). Structural modelling of molecular dynamics suggests that the RGD motif does not bind to integrin (Othman et al., 2022).
The R residue in the RGD motif is experimentally shown to enhance binding of the virus to human cells and subsequent viral entry into the cell (Zech et al., 2021). In the bat-derived virus RaTG13, the closest relative of SARS-CoV-2, the site homologous to R403 is T403 (Figure 4). The spike protein of RaTG13 is weak in binding to human ACE2 relative to SARS-2S (Li et al., 2021). Replacing T403 in RaTG13 by R403 enhances viral receptor binding and viral entry into human cells (Zech et al., 2021). Similarly, replacing R403 in SARS-2S by T403 reduces the viral binding and cell entry (Zech et al., 2021). However, R403 was interpreted to enhance the binding between SARS-2S and ACE2, especially between positively charged R403 in SARS-2S and negatively charged E37 in ACE2, but not between SARS-2S and integrin (Zech et al., 2021). This interpretation, based only on structural modelling (Zech et al., 2021), is probably tenuous. The experimentally determined structure (7KNB) of human ACE2 in complex with the SARS-2S trimer (Zhou T. et al., 2020) shows that R403 in SARS-2S and E37 in ACE2 are not close physically (Figure 6A). Of the three R403 residues, one in each of the SARS-2S monomers, the closest to E37 is 11.41 Å away (Figure 6A). I should mention that there are many different formulations of inter-residue distances. The first (and the simplest) is the distance between the alpha-carbon in one residue and the alpha-carbon in the other residue. This tends to be the most stable across different experimentally determined structures, and is the distance in Figure 6A. The second is to first calculate the centroid for each amino acid, and then calculate the distance between the two centroids. The third is to compute the centroid of the interacting functional groups, e.g., the amino group in lysine and the carboxyl group in glutamate, and then compute the distance between the two centroids.
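The three formulations of inter-residue distance can be made concrete with a minimal sketch. It assumes each residue is given as a mapping from PDB atom name to (x, y, z) coordinates in Å; the function names and the toy coordinates are hypothetical and are not taken from 7KNB.

```python
import math


def centroid(points):
    """Unweighted mean position of a collection of (x, y, z) points."""
    points = list(points)
    if not points:
        raise ValueError("no coordinates given")
    return tuple(sum(p[i] for p in points) / len(points) for i in range(3))


def alpha_carbon_distance(res_a, res_b):
    """Formulation 1: distance between the two alpha-carbons."""
    return math.dist(res_a["CA"], res_b["CA"])


def centroid_distance(res_a, res_b, atoms_a=None, atoms_b=None):
    """Formulation 2 (all atoms) or 3 (only the named functional-group atoms)."""
    pts_a = res_a.values() if atoms_a is None else [res_a[n] for n in atoms_a]
    pts_b = res_b.values() if atoms_b is None else [res_b[n] for n in atoms_b]
    return math.dist(centroid(pts_a), centroid(pts_b))


if __name__ == "__main__":
    # Toy residues with made-up coordinates (not real structural data).
    lys = {"CA": (0.0, 0.0, 0.0), "NZ": (0.0, 0.0, 2.0)}
    glu = {"CA": (3.0, 4.0, 0.0), "OE1": (3.0, 4.0, 2.0), "OE2": (3.0, 4.0, 4.0)}
    print(alpha_carbon_distance(lys, glu))
    print(centroid_distance(lys, glu))
    print(centroid_distance(lys, glu, atoms_a=["NZ"], atoms_b=["OE1", "OE2"]))
```

For an arginine and glutamate pair, the functional-group atoms would typically be NE, CZ, NH1 and NH2 for the guanidinium group and CD, OE1 and OE2 for the carboxyl group. The three formulations can give noticeably different numbers for the same pair of residues, so distances are comparable only when computed the same way.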
Instead of an interaction between R403 and E37, the structure (7KNB) shows an electrostatic interaction between the positively charged R403 side chain and the negatively charged D405 side chain (Figure 6B), the two being 4.15 Å apart. The force of attraction between oppositely charged side chains is inversely proportional to d² (where d is the distance between the interacting partners), so the electrostatic interaction between R403 and D405 within SARS-2S should be much stronger than that between R403 in SARS-2S and E37 in human ACE2. The structural relationship among residues appears consistent across different structural experiments. For example, when human ACE2 is in complex with a SARS-2S monomer instead of a trimer, the distance between R403 in SARS-2S and E37 in ACE2 is 11.15 Å (negligibly smaller than the previous 11.41 Å). Thus, the structure does not suggest a strong interaction between the RGD motif and ACE2, so the RGD motif is free to interact with others, including integrins.
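The size of this difference can be estimated with a short calculation. The sketch assumes a simple Coulomb point-charge model with equal charge magnitudes and the same dielectric environment for both pairs, and it treats the two reported distances as comparable even though they may follow different distance formulations; the function name is hypothetical.

```python
def relative_coulomb_force(d_near, d_far):
    """Ratio of Coulomb forces at two distances, all else being equal.

    Force is proportional to 1 / d**2, so F(d_near) / F(d_far) = (d_far / d_near)**2.
    """
    if d_near <= 0 or d_far <= 0:
        raise ValueError("distances must be positive")
    return (d_far / d_near) ** 2


if __name__ == "__main__":
    # 4.15 Å (R403 to D405) versus 11.41 Å (R403 to E37), as reported in the text.
    ratio = relative_coulomb_force(4.15, 11.41)
    print(round(ratio, 2))  # about 7.56
```

Under these assumptions the R403-D405 attraction would be roughly 7.6 times stronger than an R403-E37 attraction. Solvent screening and local geometry would change the actual value.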
However, there could be an indirect interaction between SARS-2S and integrin through soluble ACE2 (sACE2) as follows. Membrane proteins ADAM-17 and TMPRSS2 cleave the extracellular domain of ACE2, generating sACE2 (Donoghue et al., 2000; Kuba et al., 2010; Scheller et al., 2011; Heurich et al., 2014). sACE2, which features its own RGD motif at sites 204-206, can bind to integrins in either an RGD-dependent or an RGD-independent manner (Clarke et al., 2012). SARS-2S could first bind to sACE2 and then be brought close to integrin through sACE2-integrin binding. This is consistent with the observation that shedding of ACE2 results in increased uptake of SARS-CoV virions into host cells (Haga et al., 2008, 2010; Heurich et al., 2014).
There has been insufficient exploration of the functional consequence of the K403R change. Lysine acetylation occurs in both nucleus and cytoplasm (Sadoul et al., 2011;Mu et al., 2020) and removes the positive charge of the lysine residue. Because the lysine in the KGD motif in SARS-S is located in a long flexible loop, it could be acetylated and lose its potential to interact electrostatically with a negative amino acid residue. In contrast, R403 will always be positively charged under normal cellular or tissue pH. It is consequently important to know if K403 in SARS-S is acetylated during virion assembly.

Other candidate receptors and cofactors requiring further empirical confirmation
It has also been suggested that kidney injury molecule-1 (KIM1) may serve as an alternative host receptor for SARS-S and SARS-2S (Yang et al., 2021). However, the evidence is not strong, and the argument that ACE2 alone cannot explain the kidney impairment associated with COVID-19 infection is weak. ACE2 is more highly expressed in kidney than in lung based on tissue-specific expression of protein-coding genes (Fagerberg et al., 2014;Uhlén et al., 2015), as well as on ACE2 activity assays in diabetic mice (Wysocki et al., 2006), which seems sufficient to explain the susceptibility of kidneys to COVID-19 infection without any need to invoke alternative receptors.
Kidney impairment associated with COVID-19 can be explained by the impairment of ACE2 function in degrading Ang II (Figure 7A). ACE2 protects kidneys from unchecked RAS responses including hypertension, inflammation and tissue damage (Kuba et al., 2010; Soler et al., 2013). COVID-19 infection in kidney destroys kidney cells expressing ACE2 and exposes the kidneys to unchecked RAS responses, causing kidney impairment. Two other membrane proteins, AXL and CD147, may deserve attention. AXL is a putative cell receptor for Zika virus (Nowakowski et al., 2016), and has been implicated in mediating cell entry via the endosome pathway by SARS-CoV-2 (Bohan et al., 2021). CD147 is a membrane glycoprotein known to be involved in infection by eukaryotic, prokaryotic and viral pathogens (Fenizia et al., 2021), and may also bind to SARS-2S and mediate the cell entry of SARS-CoV-2 through endocytosis (Brodowski et al., 2022; Cavezzi et al., 2022; Kalejaiye et al., 2022), although infection mediated by CD147 is most likely secondary because CD147 is highly expressed in neural tissues but not in the respiratory tract (Qiao et al., 2020). Both genes appear weakly expressed in lungs based on tissue-specific gene expression data in The Human Protein Atlas (Uhlén et al., 2015). However, as I mentioned before, ACE2 is highly expressed in type II pneumocytes in lungs (Hamming et al., 2004; To and Lo, 2004; Mossel et al., 2008; Xu et al., 2020; Zhao et al., 2020) despite its low overall expression in lung tissue, so the low tissue-specific expression of these two genes may not exclude the possibility of high expression in certain types of cells.
Another cofactor proposed to bind SARS-2S and facilitate SARS-CoV-2 cell entry is sialic acid-containing glycolipids (Nguyen et al., 2022). Depletion of these glycolipids decreases SARS-CoV-2 infection. However, the depletion of these glycolipids could have multiple consequences. It may impair membrane integrity and render epithelial cells more exposed. For example, mucins are important membrane components, and the loss of mucins enhances SARS-CoV-2 infection (Biering et al., 2022). Also, loss of membrane integrity may cause shedding of ACE2 and candidate cofactors such as NRP1 and consequently generate outcomes that are difficult to interpret.
The hypothesis of sialic acid-containing glycolipids or sialylated glycans as a receptor for SARS-2S is vague because many membrane proteins are sialylated glycoproteins, including ACE2 and CD147. However, ACE2 glycan processing has little effect on SARS-CoV-2 recognition (Allen et al., 2021). One may therefore infer that features other than sialylated glycans are important in mediating SARS-CoV-2 infection.

Host receptors and cofactors as drug targets?
ACE2 has been studied as a drug target ever since it was identified as the host receptor for SARS-CoV. The rationale seems straightforward. Given that ACE2 is a gate to let SARS-CoV-2 into the
Figure. Protein structure of human ACE2 in complex with the SARS-2S trimer (PDB accession 7KNB) (Zhou T. et al., 2020). (A) The distance from the negatively charged E37 in ACE2 to the three positively charged R403 residues, one in each of the three SARS-2S monomers. (B) R403 and D405 within SARS-2S, which are 4.15 Å apart, interact with each other electrostatically.
There are no direct experiments relating ACE2 abundance to the risk of COVID-19. Consequently, an indirect approach has been used to address the question. COVID-19 symptoms are more severe in the old-age group (OG) than in the young or middle-age group (YG). ACE2 abundance has been measured in different age groups to establish the relationship between ACE2 abundance and age. If OG expresses more ACE2 than YG, then ACE2 abundance is a likely contributor to the severity of COVID-19 in OG.

In a well-planned comparative study among groups of different ages and ACE2 expression in a hospital cohort (Plaas et al., 2021), ACE2 expression is higher in YG than in OG. Similarly, careful and structured meta-analysis also supports higher ACE2 in YG than in OG (Bastolla et al., 2022). However, large-scale compilation of data sometimes leads to contradictory results (Zheng, 2022). Some of the discrepancy could be explained by pooling unbalanced data. For example, ACE2 expression may be higher in YG than in OG within each of two regions, yet if the two regions differ in overall ACE2 expression and in the proportions of YG and OG individuals sampled, the two pooled mean values would mislead us to conclude that ACE2 expression is higher in OG than in YG. This Simpson paradox, typically illustrated with data on kidney stone surgery (Xia, 2018a), is often forgotten in large-scale data compilations.
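The pooling problem described above can be made concrete with a minimal sketch. All numbers below are hypothetical (arbitrary expression units and invented sample sizes, not values from any cited study), and the names `records` and `pooled_mean` are introduced here only for illustration. Within each region YG has the higher mean, but the unbalanced sample sizes reverse the ordering of the pooled means.

```python
# Hypothetical ACE2 expression summaries (arbitrary units); NOT data from any cited study.
# Each record: (region, age_group, mean_expression, sample_size)
records = [
    ("region1", "YG", 20.0, 10),   # few young individuals sampled in a high-expression region
    ("region1", "OG", 18.0, 90),
    ("region2", "YG", 5.0, 90),    # many young individuals sampled in a low-expression region
    ("region2", "OG", 3.0, 10),
]


def pooled_mean(rows, group):
    """Sample-size-weighted mean across all regions for one age group."""
    total = sum(mean * n for _, g, mean, n in rows if g == group)
    count = sum(n for _, g, _, n in rows if g == group)
    return total / count


def region_means(rows, region):
    """Per-group means within a single region."""
    return {g: mean for r, g, mean, _ in rows if r == region}


for region in ("region1", "region2"):
    means = region_means(records, region)
    print(region, "YG > OG:", means["YG"] > means["OG"])

print("pooled YG:", pooled_mean(records, "YG"))
print("pooled OG:", pooled_mean(records, "OG"))
```

Comparing YG and OG within each region (or fitting a model with region as a factor) avoids the reversal that the pooled means produce.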
The observation that OG has lower ACE2 abundance than YG (Plaas et al., 2021; Bastolla et al., 2022) seems incompatible with the observation that OG suffers more from COVID-19 than YG. There are two explanations. First, although ACE2 is higher in YG than in OG, the level of ACE2 in OG is still sufficient for initiating SARS-CoV-2 infection. Second, as I mentioned before, SARS-CoV-2 infection is associated with syncytia formation. This means that, once SARS-CoV-2 has infected a cell, its spread from the infected cell to neighboring uninfected cells may not need ACE2. In contrast to SARS-CoV-2, SARS-CoV infection does not form syncytia, so infection of new cells requires ACE2. Because ACE2 is more abundant in YG than in OG, people in YG tend to have a higher risk of SARS-CoV infection than those in OG, which is consistent with SARS epidemiological data. I should emphasize that previous studies quantifying ACE2 expression do not take into consideration sACE2 (the soluble portion of ACE2), so one should be cautious in interpreting ACE2 abundance and COVID-19 risk in different age groups.

3.2. Targeting ACE2 to reduce the COVID-19 risk without impacting ACE2 function
ACE2 has multiple functions (Fyhrquist and Saijonmaa, 2008; Kuba et al., 2010), but its best-documented function is to buffer the RAS (renin-angiotensin system) effect for blood pressure homeostasis (Figure 7A). Human liver produces the 485-aa angiotensinogen which, after cleaving the 33-aa N-terminal signal peptide (Kumar et al., 2011), is released as the 452-aa mature circulating angiotensinogen (Figure 7A). When blood pressure falls, renin released from kidney cells converts angiotensinogen to Ang I, with cleavage between 10L and 11V (Yan et al., 2019). Ang I is in turn converted by ACE, a peptidyl dipeptidase, to Ang II (Figure 7A). Ang II interacts with two receptors, but acts mainly through AT1R (Figure 7A), to increase the blood pressure by (1) increasing the blood volume and (2) constricting the blood vessels (vasoconstriction). This RAS function, if unchecked, would lead to hypertension, inflammation, tissue damage, heart failure, and other cardiovascular abnormalities (Kuba et al., 2010). Carboxypeptidase ACE2 takes short oligopeptides such as peptide hormones and cleaves efficiently at the Pro^X junction (where X is a hydrophobic amino acid at the C-terminus) (Donoghue et al., 2000; Tipnis et al., 2000; Dales et al., 2002). Ang II ends in Pro-Phe and is therefore an efficient ACE2 substrate, which ACE2 converts to Ang1-7 by removing the C-terminal Phe. This reduction in Ang II, together with the binding of the resulting Ang1-7 to MAS receptors (Figure 7A), buffers the RAS effect to maintain blood pressure homeostasis. People with low levels of ACE2 tend to have high levels of Ang II and hypertension, and need to be treated with ACE inhibitors so that Ang I is not converted to Ang II (Kuba et al., 2010). Alternatively, one may use drugs such as griseofulvin (a known vasodilator) which decreases blood pressure (Rubin, 1963; Aldinger, 1968). A recent study suggests that the griseofulvin effect may be mediated by its binding to ACE2 (Aris et al., 2022), i.e., griseofulvin may be an ACE2 enhancer.
Figure 7. The RAS (renin-angiotensin system) and the exploitation of ACE2 by SARS-CoV-2 as a cell receptor. (A) The RAS in maintaining blood pressure homeostasis through negative feedback. Low blood pressure triggers the release of renin which cleaves angiotensinogen (colored green) to produce Ang I; ACE cleaves Ang I to generate Ang II which binds to the AT1R receptor to increase blood pressure; ACE2 degrades Ang II to prevent prolonged hypertension. ADAM17 cleaves the extracellular domain of ACE2 to generate soluble ACE2 (sACE2) which is also enzymatically active in degrading Ang II. (B) SARS-CoV-2 could anchor itself to the cell membrane by binding either to ACE2 or to sACE2 through other membrane proteins such as integrin (not to scale).
Oligopeptides with His^X at the C-terminus can also serve as substrates for ACE2, although the cleavage is not as efficient as Pro^X (Dales et al., 2002). ACE2 can therefore cleave the terminal leucine in Ang I (Figure 7A). However, Ang I also inhibits ACE2 activity (Dales et al., 2002; Figure 7A), which is essential for the accumulation of Ang II. High levels of Ang I indicate weak ACE activity and a weak RAS effect, so ACE2 should be at low activity as well. When Ang I is converted to Ang II, the inhibitory effect of Ang I on ACE2 is removed, and the active ACE2 clears Ang II to prevent hypertension.
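The cleavage steps described above can be traced on the peptide sequences themselves. The sketch below is illustrative only: the Ang I sequence is the standard human decapeptide, which is an assumption brought in here rather than something stated in the text, and the function names are hypothetical. It models ACE as removing a C-terminal dipeptide and ACE2 as removing a single C-terminal residue, and ignores kinetics and inhibition entirely.

```python
# Standard human angiotensin I sequence (assumption: not given in the text above).
ANG_I = "DRVYIHPFHL"


def ace_cleave(peptide):
    """ACE acts as a peptidyl dipeptidase: remove the C-terminal dipeptide."""
    return peptide[:-2]


def ace2_cleave(peptide):
    """ACE2 acts as a carboxypeptidase: remove the single C-terminal residue."""
    return peptide[:-1]


def ace2_junction(peptide):
    """Residues flanking the ACE2 cleavage site, written as in the text (e.g. 'P^F')."""
    return f"{peptide[-2]}^{peptide[-1]}"


ang_ii = ace_cleave(ANG_I)       # Ang II
ang_1_7 = ace2_cleave(ang_ii)    # Ang1-7, the product of ACE2 acting on Ang II
ang_1_9 = ace2_cleave(ANG_I)     # product of ACE2 removing the terminal Leu from Ang I

print("Ang II :", ang_ii, "ACE2 junction", ace2_junction(ang_ii))
print("Ang1-7 :", ang_1_7)
print("Ang I  :", ANG_I, "ACE2 junction", ace2_junction(ANG_I))
print("Ang I minus terminal Leu:", ang_1_9)
```

The junction for Ang II is of the Pro^X type (efficient cleavage), whereas the junction for Ang I is of the His^X type (less efficient cleavage), matching the distinction drawn in the paragraph.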
Ignoring the sACE2 activity in degrading Ang II may lead to misunderstanding of the negative feedback regulation of the RAS. For example, an increase in Ang II level was associated with a decrease in myocardial ACE2 protein level (Patel et al., 2014). If one takes the decreased myocardial ACE2 protein level as decreased ACE2 activity, then one would conclude that an increase in Ang II, instead of increasing the ACE2 activity to degrade Ang II, actually decreases the ACE2 activity. This would imply a positive feedback in which a high Ang II concentration decreases the ACE2 that degrades Ang II, so that an increase in Ang II would lead to a further increase in Ang II. However, the observed decrease in myocardial ACE2 may not imply decreased ACE2 activity because such a decrease in myocardial ACE2 may be associated with an increase in sACE2 (Figure 7A). Because sACE2 is also enzymatically active in degrading Ang II (Kuba et al., 2010; Patel et al., 2014), the total ACE2 activity may not be decreased even though the myocardial ACE2 level is decreased. One needs to measure total ACE2 activity, including both the membrane-bound ACE2 and sACE2, in converting Ang II to Ang1-7 (Figure 7A).
Given the essential function of ACE2, simply downregulating ACE2 expression to reduce SARS-CoV-2 infection (Brevini et al., 2023) may incur the side effect of insufficient ACE2 activity. However, low ACE2 activity could be compensated by inhibitors of ACE such as MLN4760 (Dales et al., 2002) or angiotensin-receptor blockers (ARBs) (Kuba et al., 2010; Bosso et al., 2020) or vasodilators such as griseofulvin (Aris et al., 2022). The binding site between ACE2 and the RBD of SARS-S and SARS-2S (Gui et al., 2017; Wang et al., 2020; Zhou T. et al., 2020; Xu et al., 2021) does not cover the zinc-binding metallopeptidase domain of ACE2 (Kuba et al., 2010). Therefore, it is theoretically possible to develop a drug that would interfere with the binding between the host ACE2 and the viral RBD without affecting ACE2's function in converting Ang II to Ang1-7. Many drug-screening studies check only binding affinity between a candidate drug and human ACE2 (Mathew et al., 2021; Aris et al., 2022). A reasonable drug candidate should bind to the site of interaction between SARS-2S and ACE2, but should not bind to the zinc-binding catalytic site of ACE2.
The same principle of reducing infection without impacting function should be applied not only to host receptors such as ACE2, but also to other cofactors such as NRP1 (Cantuti-Castelvetri et al., 2020; Daly et al., 2020) because a proper level of NRP1 protein is essential for cardiovascular and neuronal development (Guo and Vander Kooi, 2015). Both overexpression of the gene (Kawasaki et al., 1999) and knock-out of the gene (Kitsukawa et al., 1995) are lethal in mice.
The functional implication of sACE2 remains elusive. The cleavage of ACE2 by ADAM-17 (Haga et al., 2008, 2010; Scheller et al., 2011) is activated in SARS-CoV infection, generating sACE2 (Figure 7A). This could be either a host-mediated protection response or a virus-mediated response to colonize cells not expressing ACE2. If membrane-bound ACE2 mediates viral entry, then cleaving it off the membrane would protect the ACE2-expressing cell from infection. However, this hypothesis of a host-mediated protection response is contradicted by the observation that the generation of sACE2 enhances infection (Haga et al., 2008, 2010). It is possible that SARS-CoV-2 can bind to both membrane-bound ACE2 and sACE2, and that sACE2 could then bind to membrane proteins such as neuropilin-1 (Cantuti-Castelvetri et al., 2020; Daly et al., 2020), integrins (Nader et al., 2021; Nader and Kerrigan, 2022), or other membrane proteins to anchor SARS-CoV-2 to the membrane of host cells that do not have ACE2 (Figure 7B). For example, SARS-2S could first bind to sACE2, which then binds to membrane integrin (Figure 7B). This is consistent with the observation that the shedding of ACE2 results in increased uptake of SARS-CoV virions into host cells, and therefore supports the alternative hypothesis of a virus-mediated response to colonize host cells that do not express ACE2.

Predicting mammalian species susceptible to SARS-CoV-2
Many studies have used similarity in ACE2 sequences and sharing of interacting amino acids between ACE2 and the viral RBD to predict vulnerability of other mammalian species to COVID-19 (Damas et al., 2020; Shi et al., 2020; Kruglikov et al., 2021; Wei et al., 2021). The general rationale is that the ACE2 of a mammalian species highly similar to the ACE2 of susceptible species (e.g., human) would serve as a host receptor for SARS-CoV-2 and predispose the species to SARS-CoV-2 infection. SARS-2S can use ACE2 in many species for cell entry (Hossain et al., 2020; Shi et al., 2020; Zhai et al., 2020; Li et al., 2021), including all tested primate species, pangolins, and several carnivorous species.
Two different approaches have been used for the prediction. The first and the simplest index of vulnerability is based on phylogenetic analysis of aligned ACE2 sequences (Figure 8). Species with a short root-to-tip distance have relatively conserved ACE2, and these species, colored in red (Figure 8), tend to be susceptible to SARS-CoV-2 infection. Within rodents, the golden hamster (Mesocricetus auratus) can be infected by both SARS-CoV and SARS-CoV-2, and its ACE2 is closer to the putative root than mouse and rat that are not vulnerable unless humanized with human ACE2. Within Chiroptera, Rhinolophus affinis is more vulnerable than other bat species (Li et al., 2021) and its ACE2 is closer to the putative root than others (Figure 8). Within primates, human ACE2 is closer to the putative root than other primates, and humans appear to be more vulnerable to COVID-19 infection than other primates. An ideal receptor protein from a viral perspective is one that is (1) highly expressed in epithelial cell surface of mammalian respiratory or digestive tracts, (2) functionally important so that its expression cannot be readily downregulated by the host in response to the infection, and (3) strongly conserved in evolution so many mammalian species can serve as potential hosts. We have shown previously that ACE2 meets the first two criteria. Figure 8 shows that ACE2 also meets the last criterion.
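The root-to-tip index used in the first approach is the sum of branch lengths from the root to each tip. A minimal sketch of that calculation is given below, assuming a rooted tree stored as nested tuples; the topology, tip labels and branch lengths are all hypothetical and do not correspond to Figure 8.

```python
# Hypothetical rooted tree; each node is (name, branch_length, children).
# Branch lengths are invented and are NOT taken from the ACE2 phylogeny.
tree = ("root", 0.0, [
    ("species_A", 0.10, []),
    ("internal_1", 0.05, [
        ("species_B", 0.02, []),
        ("species_C", 0.30, []),
    ]),
])


def root_to_tip(node, dist_from_root=0.0):
    """Return {tip_name: summed branch length from the root to that tip}."""
    name, length, children = node
    dist = dist_from_root + length
    if not children:                      # a tip: report its accumulated distance
        return {name: dist}
    distances = {}
    for child in children:                # an internal node: recurse into each child
        distances.update(root_to_tip(child, dist))
    return distances


distances = root_to_tip(tree)
# Tips ordered from most conserved (shortest distance) to least conserved.
ranked = sorted(distances, key=distances.get)
for tip in ranked:
    print(f"{tip}: {distances[tip]:.2f}")
```

In practice the tree would come from a phylogenetics program and be read with a dedicated library; the nested-tuple form is used here only to keep the sketch self-contained.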
The second approach for predicting species vulnerability incorporates information from protein structures. The characterization of the structure of SARS-2S (Hoffmann et al., 2020; Walls et al., 2020; Wrapp et al., 2020; Yan et al., 2020), especially those with the SARS-2S trimer in complex with ACE2 (Gui et al., 2017; Wang et al., 2020; Zhou T. et al., 2020; Xu et al., 2021), provides a list of amino acids in physical contact between ACE2 and SARS-2S (Lu et al., 2015; Adhikari and Ching, 2020). The sharing of the interacting amino acids in ACE2 has been used to predict which mammalian species have an ACE2 that can serve as a host receptor for SARS-CoV-2 infection (Shi et al., 2020; Zhai et al., 2020; Kruglikov et al., 2021; Wei et al., 2021).
One stretch of five amino acids in human ACE2 (hACE2), 353KGDFR357 (Figure 9), is particularly worth highlighting because (1) the amino acid composition in the 5-aa motif implies that it is highly hydrophilic and should stay on the surface of the protein, (2) it is in close physical contact with the 500TNGVGY505 segment in SARS-2S (Figure 9), based on the structure 6M0J (Lan et al., 2020), and (3) it is highly conserved, except 354G, across representative species in Carnivora, Artiodactyla and Chiroptera. A highly conserved hACE2 binding motif means that SARS-2S can use not only hACE2 in all humans, but also ACE2 in a variety of mammalian species. This would generate an unusually large array of potential host species and a consequently large viral reservoir in nature. In contrast, if the interacting motif in hACE2 binding to SARS-2S were highly variable even among different human populations, then SARS-CoV-2 would only be able to infect humans or just a specific genetically homogeneous human population.
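Checking whether another species shares the 353KGDFR357 motif amounts to a position-by-position comparison of aligned residues. The sketch below assumes the motifs have already been extracted from an alignment; the human motif comes from the text, while the other entries and the helper name `motif_differences` are hypothetical examples chosen to show a variant at position 354 and one carrying H at position 353.

```python
HUMAN_MOTIF = "KGDFR"   # hACE2 residues 353-357, as given in the text
MOTIF_START = 353       # position of the first motif residue in hACE2


def motif_differences(reference, other, start=MOTIF_START):
    """List substitutions relative to the reference motif, e.g. ['K353H']."""
    if len(reference) != len(other):
        raise ValueError("motifs must be aligned and of equal length")
    return [
        f"{ref}{start + i}{alt}"
        for i, (ref, alt) in enumerate(zip(reference, other))
        if ref != alt
    ]


# Hypothetical aligned motifs, for illustration only.
examples = {
    "hypothetical_species_1": "KGDFR",   # identical to the human motif
    "hypothetical_species_2": "KNDFR",   # differs only at the variable position 354
    "hypothetical_species_3": "HGDFR",   # carries H at position 353
}

for species, motif in examples.items():
    diffs = motif_differences(HUMAN_MOTIF, motif)
    print(species, diffs if diffs else "identical to hACE2 motif")
```

A shared motif is only one piece of evidence; it says nothing about receptor expression or later steps of infection.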
Among rodents, the mouse and rat ACE2, which cannot serve as a SARS-2S receptor, have 353H. One may infer that an H353K mutation would change the mouse ACE2 to a SARS-2S receptor. Such a mutation has been carried out both in nature and by virologists. The golden hamster (Mesocricetus auratus) has 353K and is susceptible to SARS-CoV-2 infection. Experimental introduction of an H353K mutation into mouse ACE2, i.e., hmACE2.1 in Adams et al. (2021), substantially improved the function of mouse ACE2 as a SARS-2S receptor. There should be more coevolutionary studies between hosts and pathogens.
Figure 8. ACE2 phylogeny of representative mammalian species in Chiroptera (bats), Pholidota (pangolins), Carnivora (felids and canids), Primates, and Rodentia. The protein sequences were aligned by MAFFT (Katoh and Toh, 2008) with the accurate but slow L-INS-i option. The phylogenetic tree was reconstructed with PhyML (Guindon and Gascuel, 2003), with the empirical LG substitution matrix and optimization of topology, branch lengths and rates. The reconstructed tree is unrooted but is displayed with mid-point rooting. The species in red have been empirically shown to be vulnerable to SARS-CoV-2 infection. They are closer to (have fewer substitutions in ACE2 from) the putative common ancestor (indicated by a black solid circle) than species not vulnerable to SARS-CoV-2 infection.
Prediction of species vulnerability based on ACE2 alone is confounded by many factors. First, successful viral infection involves multiple steps including cell attachment, cell entry, evasion of host immune systems, viral genome replication, transcription, translation, packaging of virions, and cell lysis and viral release. Having a suitable ACE2 receptor represents just one of these steps. For example, pig ACE2 appears to serve as a good receptor for SARS-2S (Li et al., 2021), but SARS-CoV-2 does not infect pigs.
Second, an animal whose ACE2 is highly similar to human ACE2 may express little ACE2 in the respiratory tract. For example, in contrast to humans and other primates, dogs express relatively little ACE2 in the respiratory tract but high ACE2 in the digestive tract (Naqvi et al., 2019; Zhai et al., 2020), so it is not surprising to find positive rectal swabs but not positive pharyngeal swabs in experimental dogs a few days after inoculation with SARS-CoV-2 (Shi et al., 2020). Thus, a prediction that SARS-CoV-2 would cause respiratory diseases in dogs because dogs have an ACE2 similar to human ACE2 is not quite true because of the low expression of ACE2 in the respiratory tract of dogs.

In summary, host receptor identification and related studies require a multidisciplinary approach involving diverse types of data and integrative data analyses. This review may contribute to the design of training programs for future virologists.

Author contributions
The author confirms being the sole contributor of this work and has approved it for publication.

Funding
This research was funded by a Discovery Grant from the Natural Sciences and Engineering Research Council (NSERC, RGPIN/2018-03878) of Canada. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.